Search | BVS - MINISTÉRIO DA SAÚDE

Publisher Correction: Large language models encode clinical knowledge.

Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; Mahdavi, S Sara; Wei, Jason; Chung, Hyung Won; Scales, Nathan; Tanwani, Ajay; Cole-Lewis, Heather; Pfohl, Stephen; Payne, Perry; Seneviratne, Martin; Gamble, Paul; Kelly, Chris; Babiker, Abubakr; Schärli, Nathanael; Chowdhery, Aakanksha; Mansfield, Philip; Demner-Fushman, Dina; Agüera Y Arcas, Blaise; Webster, Dale; Corrado, Greg S; Matias, Yossi; Chou, Katherine; Gottweis, Juraj; Tomasev, Nenad; Liu, Yun; Rajkomar, Alvin; Barral, Joelle; Semturs, Christopher; Karthikesalingam, Alan; Natarajan, Vivek.

Nature ; 620(7973): E19, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37500979

Large language models encode clinical knowledge.

Nature ; 620(7972): 172-180, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37438534

ABSTRACT

Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA and Measuring Massive Multitask Language Understanding (MMLU) clinical topics), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.

Subjects

Benchmarking , Computer Simulation , Knowledge , Medicine , Natural Language Processing , Bias , Clinical Competence , Comprehension , Datasets as Topic , Licensure , Medicine/methods , Medicine/standards , Patient Safety , Physicians

Rewards-driven control of robot arm by decoding EEG signals.

Tanwani, Ajay Kumar; del R Millan, Jose; Billard, Aude.

Annu Int Conf IEEE Eng Med Biol Soc ; 2014: 1658-61, 2014.

Article in English | MEDLINE | ID: mdl-25570292

ABSTRACT

Decoding the user's intention from non-invasive EEG signals is a challenging problem. In this paper, we study the feasibility of predicting the goal for controlling a robot arm in self-paced reaching movements, i.e., spontaneous movements that do not require an external cue. Our proposed system continuously estimates the goal throughout a trial, starting before the movement onset, by online classification and generates optimal trajectories for driving the robot arm to the estimated goal. Experiments using EEG signals of one healthy subject (right arm) yield smooth reaching movements of the simulated 7-degree-of-freedom KUKA robot arm in a planar center-out reaching task, with approximately 80% accuracy of reaching the actual goal.

Subjects

Arm/physiology , Electroencephalography , Robotics , Brain-Computer Interfaces , Humans , Movement , Reward , Stroke/physiopathology

Autonomous reinforcement learning with experience replay.

Wawrzynski, Pawel; Tanwani, Ajay Kumar.

Neural Netw ; 41: 156-67, 2013 May.

Article in English | MEDLINE | ID: mdl-23237972

ABSTRACT

This paper considers the issues of efficiency and autonomy that are required to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy with the use of previously collected samples, and autonomously estimates the appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay whose step-sizes are determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and half-cheetah demonstrates the feasibility of the proposed algorithm to solve difficult learning control problems in an autonomous way within a reasonably short time.

Subjects

Algorithms , Artificial Intelligence , Theoretical Models , Neural Networks (Computer) , Reinforcement (Psychology) , Animals , Computer Simulation , Markov Chains , Movement , Octopodiformes , Problem Solving , Stochastic Processes
